[SPARK-3181][MLLIB]: Add Robust Regression Algorithm with Huber Estimator #8013
fjiang6 wants to merge 1 commit into
Test build #40111 has finished for PR 8013 at commit
Still a lot of duplication. We're adding new features to LiR now, and this will be hard to maintain. Could you add just the objective function and use Params to switch between the different objective functions? Thanks.
Test build #40222 has finished for PR 8013 at commit
Test build #40223 has finished for PR 8013 at commit
@dbtsai Just added the objective function, and now use Params to switch between the different objective functions. Thanks!
sharedParams.scala cannot be edited directly because it is a generated file. Please look at SharedParamsCodeGen.scala.
Also, rename HasRobust to HasRobustRegression in SharedParamsCodeGen.scala.
Test build #40530 has finished for PR 8013 at commit
This class was not added by me. I didn't touch PySpark.
Test build #41422 has finished for PR 8013 at commit
Test build #41421 has finished for PR 8013 at commit
Test build #41423 has finished for PR 8013 at commit
I think it would be better to introduce a "costFunction" param that defaults to "LeastSquares" and to pattern match on it in LinearRegression#L195, since that enforces mutual exclusivity once more than two cost functions are possible.
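To make the suggestion concrete, here is a minimal, self-contained sketch of selecting a cost function by name and pattern matching on it. Every name in it (`CostFunction`, `LeastSquares`, `Huber`, `CostFunctions`) is hypothetical and is not taken from Spark or from this PR; the default of 1.345 for `k` is the value mentioned in the commit messages.

```scala
// Hypothetical sketch: none of these names come from Spark itself.
sealed trait CostFunction
case object LeastSquares extends CostFunction
final case class Huber(k: Double) extends CostFunction {
  require(k > 0.0, s"k must be positive, got $k")
}

object CostFunctions {
  // Resolve the string value of a hypothetical "costFunction" param.
  // Exactly one cost function is selected, so the options are mutually exclusive.
  def fromString(name: String, k: Double = 1.345): CostFunction =
    name.toLowerCase match {
      case "leastsquares" => LeastSquares
      case "huber"        => Huber(k)
      case other =>
        throw new IllegalArgumentException(s"Unknown costFunction: $other")
    }

  // Per-example loss on a residual r. The match is over a sealed trait, so
  // adding a third cost function makes the compiler warn about this match.
  def loss(cost: CostFunction)(r: Double): Double = cost match {
    case LeastSquares => 0.5 * r * r
    case Huber(k) =>
      val a = math.abs(r)
      if (a <= k) 0.5 * r * r else k * (a - 0.5 * k)
  }
}
```

For example, `CostFunctions.loss(CostFunctions.fromString("huber"))(3.0)` evaluates the Huber loss with the default threshold. Compared with a boolean "robust" flag, a single string-valued param cannot end up with two cost functions enabled at once.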
Test build #41662 has finished for PR 8013 at commit
nit: This is not really an "Option"; can we just make this say "Set whether to use robust Huber Cost Function"?
There is a lot of code repetition between this and #2096, perhaps you can make the
Hello, the robust tuning parameter k should not be a hard-coded constant, as it is in your implementation.
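For reference, the textbook Huber loss on a residual r shows where k enters. This is the standard definition, stated here as background and not necessarily the exact form used in this PR:

```latex
\[
L_k(r) =
\begin{cases}
\tfrac{1}{2} r^2, & |r| \le k \\
k \left( |r| - \tfrac{1}{2} k \right), & |r| > k
\end{cases}
\qquad
\frac{\partial L_k}{\partial r} =
\begin{cases}
r, & |r| \le k \\
k \,\operatorname{sign}(r), & |r| > k
\end{cases}
\]
```

Since k sets the point where the loss switches from quadratic to linear, the right value depends on the scale of the residuals, which is why exposing it as a param is preferable to fixing it in code.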
add the objective function, and use Params to switch
edit to pass scala style tests
make HasRobustRegression in SharedParamsCodeGen.scala, Make the document more explicitly and make k tunable and default to 1.345 by having another param
UnitTests with Outliers
UnitTests with Outliers
Edit HuberAggregator scala codestyle
Update LinearRegression.scala
e447623 to 01601ee
Test build #49555 has finished for PR 8013 at commit
Thanks for the pull request. I'm going through a list of pull requests to cut them down since the sheer number is breaking some of the tooling we have. Due to lack of activity on this pull request, I'm going to push a commit to close it. Feel free to reopen it or create a new one. We can also continue the discussion on the JIRA ticket. @dbtsai there are a few pull requests that were waiting on your review. Can you revisit them even if they are closed?
Huber Robust Regression under spark/ml/regression
Unit Tests